On Theoretically Valid Score Distributions in Information Retrieval
Abstract
In this paper, we investigate the practical usefulness of the Recall-Fallout Convexity Hypothesis (RFCH) for several document score distribution (SD) models. We compare SD models that do not automatically adhere to the RFCH with modified versions of the same models that do, using the inference of average precision as the measure of utility. For the three models studied in this paper, we conclude that adhering to the RFCH is practically useful for the two-normal model, makes no difference for the two-gamma model, and degrades the performance of the two-lognormal model.
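As an illustration of what adhering to the RFCH means for a two-normal model, the sketch below may help. It is not the paper's method. It relies on the standard fact that the slope of the recall-fallout curve at a score threshold equals the ratio of the relevant to the non-relevant density at that score, so the curve has the required shape when this ratio does not decrease as the score increases. The function names, the score range, and the grid resolution are assumptions made for this example.

```python
import math


def normal_logpdf(x, mu, sigma):
    """Log-density of a normal distribution with mean mu and std sigma."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2.0 * sigma ** 2)


def satisfies_rfch(mu_rel, sigma_rel, mu_non, sigma_non,
                   lo=-10.0, hi=10.0, steps=2000, tol=1e-12):
    """Grid check (hypothetical helper): is the log-likelihood ratio of
    relevant to non-relevant scores non-decreasing on [lo, hi]?

    This only tests the chosen score range; it is a numerical check,
    not a proof over the whole real line.
    """
    prev = None
    for i in range(steps + 1):
        s = lo + (hi - lo) * i / steps
        llr = normal_logpdf(s, mu_rel, sigma_rel) - normal_logpdf(s, mu_non, sigma_non)
        if prev is not None and llr < prev - tol:
            return False  # ratio dropped: the curve is not convex here
        prev = llr
    return True


# Equal variances: the log-ratio is linear in the score, so the check passes.
print(satisfies_rfch(1.0, 1.0, 0.0, 1.0))  # True
# Unequal variances: the log-ratio is quadratic, so the check fails on this range.
print(satisfies_rfch(1.0, 2.0, 0.0, 1.0))  # False
```

The second call shows why the unmodified two-normal model does not automatically adhere to the RFCH: with unequal variances the density ratio first falls and then rises as the score increases.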
Related resources
Score Distributions in Information Retrieval
We review the history of modeling score distributions, focusing on the normal-exponential mixture and investigating the theoretical as well as the empirical evidence supporting its use. We discuss previously suggested conditions which valid binary mixture models should satisfy, such as the Recall-Fallout Convexity Hypothesis, and formulate two new hypotheses considering the component distribu...
On Score Distributions and Relevance
We discuss the idea of modelling the statistical distributions of scores of documents, classified as relevant or non-relevant. Various specific combinations of standard statistical distributions have been used for this purpose. Some theoretical considerations indicate problems with some of the choices of pairs of distributions. Specifically, we revisit a generalisation of the well-known inverse...
Using Models of Score Distributions in Information Retrieval
Empirical modeling of a number of different text search engines shows that the score distributions on a per query basis may be fitted approximately using an exponential distribution for the set of nonrelevant documents and a normal distribution for the set of relevant documents. This model fits not only probabilistic search engines like INQUERY but also vector space search engines like SMART an...
Matching Scores of System Relevance and User-Oriented Relevance in SID, ISC and Google Scholar
Background and Aim: The main aim of information storage and retrieval systems is to store and retrieve relevant information, that is, to provide documents that match users' needs or requests. This study aimed to determine how closely system relevance and user-oriented relevance match in the SID, ISC and Google Scholar databases. Method: In this study 15 keywords of ...
Score Following and Retrieval Based on Chroma and Octave Representation
Building on studies of effective representations of music signals and music scores, namely chroma and octave features, this work addresses score following and score retrieval. To compensate for the shortcomings of the chromagram representation, energy distributions in different octaves are used to describe tone height information. By transforming music signals and scores into sequences of feature vectors, score f...